ABSTRACT



When applied to a problem which has more than one local optimal 
solution, most nonlinear programming algorithms will terminate 
with the first local solution found. Several methods have been 
suggested for extending the search to find the global optimum 
of such a nonlinear program. In this report we present the re- 
sults of some numerical experiments designed to compare the per- 
formance of various strategies for finding the global solution. 



I. INTRODUCTION 



It is frequently the case in applied optimization studies that 
an algorithm which is known to converge to a global optimal solution 
under certain conditions (such as convexity) will be applied to a prob- 
lem which does not satisfy these conditions. In particular, optimiza- 
tion problems which are suspected of having several local optima in 
addition to the global optimum are often solved using algorithms which 
will stop and indicate a solution whenever any local optimum is reached. 
In such cases a useful strategy is to repeat the solution process sev- 
eral times starting from different initial points to avoid accepting 
a solution which is only a local optimum. This is probably the most 
frequently suggested strategy for avoiding local solutions. 

There are also other strategies for avoiding the local solutions 
in favor of the global optimum. This paper describes some numerical 
experiments which were done to compare the performance of several strat- 
egies for organizing such a global optimization. 

II. The Problem

In order to develop and test strategies for avoiding local solu- 
tions it is necessary to specify a class of optimization problems to 
be considered. This paper will concentrate on the "essentially
unconstrained" nonlinear programming problem

    $$\text{minimize } f(x) \quad \text{subject to } x \in S \subset E^n \qquad (1)$$

where the local and global optimal solutions to (1) are known to occur 
in the interior of the set S. In such a problem the feasible region 
S determines a domain to be searched for solutions, but the boundaries 
of S do not determine the solutions. In this sense problem (1) can 
be considered "essentially unconstrained." 

Problems of this type arise frequently as the "unconstrained" 
subproblems in interior penalty function algorithms such as the Sequen- 
tial Unconstrained Minimization Technique of Fiacco and McCormick [3]. 
In the SUMT method, if the original nonlinear program is not a convex 
program, then the subproblem (1) may have local solutions which are 
distinct from the global solution. 

For problems like (1) a local optimal solution can be obtained 
by applying any of the efficient unconstrained descent algorithms 
(such as the Davidon-Fletcher-Powell method) to minimize the function 
f(x) while being careful not to penetrate the boundary of S. We 
shall now consider several strategies which try to ensure that the 
local solution we finally accept is, in fact, a global minimum. 

III. Strategies For Avoiding Local Solutions 

Six different strategies for organizing a global optimization 
are compared in this paper. These are briefly described below with 
references to more complete descriptions when they exist. 

Strategy S1 (From the folklore)

a. Set k = 1.
b. Let $x^k$ be a vector chosen at random in the search
   region S. Starting at $x^k$, perform an unconstrained
   minimization search on the function f(x), terminating
   at the local minimum $x^{k*}$.
c. Replace k with k + 1 and go to step b. At each
   stage retain the best local solution obtained to date.
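
The random-restart loop of this strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the report: `compass_search` is a crude stand-in for the unconstrained minimizer, and `double_well`, the bounds, and the number of restarts are hypothetical choices made only for the example.

```python
import random

def compass_search(f, x, lower, upper, step=0.5, tol=1e-6):
    """Crude derivative-free local minimizer (a stand-in, not Powell's method)."""
    x = list(x)
    fx = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                # Clip the trial point so the search never leaves the box S.
                trial[i] = min(upper[i], max(lower[i], x[i] + delta))
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2.0
    return x, fx

def strategy_s1(f, lower, upper, n_starts, rng):
    """Random restarts, retaining the best local minimum found to date."""
    best_x, best_f = None, float("inf")
    for _ in range(n_starts):
        x0 = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        x_star, f_star = compass_search(f, x0, lower, upper)
        if f_star < best_f:
            best_x, best_f = x_star, f_star
    return best_x, best_f

# Hypothetical one-dimensional objective with one local and one global minimum.
def double_well(x):
    return (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0]

best_x, best_f = strategy_s1(double_well, [-2.0], [2.0],
                             n_starts=10, rng=random.Random(0))
```

Several of the ten restarts will typically end at the same minimum, which is the wasted effort that the later strategies try to avoid.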



S1 is the strategy suggested in section I. Intuitively the problem
with this strategy is that it may repeatedly search to the same local
minimum if the starting points $x^k$ happen to be chosen within the
"range of attraction" of that local minimum. The next three strategies
attempt to solve this problem.



Strategy S2

a. Set k = 1, $f^* = +\infty$.
b. Randomly select points $x \in S$ until one is found with
   $f(x) < f^*$. Call this point $x^k$.
c. Starting at $x^k$ perform an unconstrained minimization
   search terminating at a new local minimum $x^{k*}$.
d. Set $f^* = f(x^{k*})$, replace k with k + 1, and go to
   step b.



In S2 a minimization (step c) is initiated at $x^k$ only if $f(x^k)$
is smaller than the best solution found to date. Hence, each successive
minimization gives a new local minimum which is better than any
found so far. The same local minimum cannot be located twice. It is,
however, much more difficult to determine the starting points $x^k$ for
strategy S2 than for S1.
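
A minimal sketch of S2 follows, assuming a box-shaped search region. The names `strategy_s2`, `crude_descent`, and `double_well` are hypothetical, and `crude_descent` is only a stand-in for the unconstrained minimizer. The `max_samples` cap is an added safeguard that is not part of the strategy as described: once $f^*$ equals the global minimum value, the sampling step can never succeed, so some budget is needed to stop it.

```python
import random

def crude_descent(f, x, step=0.25, tol=1e-6):
    """Stand-in local minimizer using shrinking coordinate steps."""
    x, fx = list(x), f(x)
    while step > tol:
        moved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:i] + [x[i] + delta] + x[i + 1:]
                ft = f(trial)
                if ft < fx:
                    x, fx, moved = trial, ft, True
        if not moved:
            step /= 2.0
    return x, fx

def strategy_s2(f, local_min, lower, upper, max_samples, rng):
    """Start a local search only from a random point that beats f*."""
    x_best, f_best = None, float("inf")
    n_minimizations = 0
    for _ in range(max_samples):
        # Step b: draw a candidate and reject it unless f(x) < f*.
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        if f(x) >= f_best:
            continue
        # Steps c and d: a descent search from x ends below f(x) < f*,
        # so the new local minimum always replaces the old f*.
        x_best, f_best = local_min(f, x)
        n_minimizations += 1
    return x_best, f_best, n_minimizations

# Hypothetical objective with a local minimum near +1 and a global one near -1.
def double_well(x):
    return (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0]

x_best, f_best, n_min = strategy_s2(double_well, crude_descent,
                                    [-2.0], [2.0], 200, random.Random(0))
```

The count `n_min` stays small because most random samples are rejected in step b; this rejection loop is where the extra cost of finding starting points shows up.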


Strategy S3 (Bocharov [1])

a. Choose $x^1$ randomly in S. Set k = 1.
b. Starting from $x^k$ perform an unconstrained minimization
   terminating at the local minimum $x^{k*}$.
c. Choose a direction $d^k \in E^n$ at random and consider
   $f(x^{k*} + a d^k)$ as the positive scalar $a$ increases.
   Moving away from $x^{k*}$ in direction $d^k$, the function
   f must initially increase (since $x^{k*}$ is a local
   minimum). Continue to increase $a$ until f begins
   to decrease when $a = a^k$.
d. Let $x^{k+1} = x^{k*} + a^k d^k$, replace k with k + 1, and
   go to step b.



Strategy S4 (Bocharov [1])

S4 is the same as S3 except that in step c, instead
of choosing the direction at random, $d^k$ is chosen
to be the direction of overall progress from the most
recent minimization:

    $$d^k = x^{k*} - x^k \qquad (2)$$
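
The escape step shared by S3 and S4 can be sketched as below. This is an illustration under assumptions, not Bocharov's procedure in detail: the scalar $a$ is advanced in fixed increments `da` up to a cap `a_max`, the random direction for S3 is drawn uniformly on the unit sphere, and all function names and the test objective are hypothetical.

```python
import math
import random

def random_direction(n, rng):
    """S3: a random unit direction (uniform on the sphere is an assumption)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def progress_direction(x_start, x_star):
    """S4: direction of overall progress of the most recent minimization."""
    return [b - a for a, b in zip(x_start, x_star)]

def escape_point(f, x_star, d, da=0.05, a_max=10.0):
    """Walk from x_star along d until f first decreases; return (point, a)."""
    def point(a):
        return [xi + a * di for xi, di in zip(x_star, d)]

    a = da
    f_prev = f(point(a))
    while a + da <= a_max:
        a_next = a + da
        f_next = f(point(a_next))
        if f_next < f_prev:
            # f has begun to decrease: the next minimization starts here.
            return point(a_next), a_next
        a, f_prev = a_next, f_next
    return None, None  # no decrease found within the scanned range

# Hypothetical objective with minima near -1.0356 and +0.96.
def double_well(x):
    return (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0]

# Leave the minimum near -1.0356 in the +x direction.
new_start, a_k = escape_point(double_well, [-1.0356], [1.0])
d_s4 = progress_direction([-1.5], [-1.0356])
d_s3 = random_direction(3, random.Random(1))
```

In this example the walk crosses the ridge between the two wells, so a minimization restarted from `new_start` descends into the other well rather than returning to the minimum just left.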



Both S3 and S4 attempt to prevent repeated minimization to the
same local optimum by moving out of the region of attraction of the
most recent local solution before starting the next minimization. By
continuing in the direction (2), strategy S4 hopes to also avoid
local minima detected before the most recent minimum.

Strategies S5 and S6 are considerably different from the
first four methods. While S1 - S4 attempt to choose good starting
points for repeated local minimizations, S5 and S6 attempt to
gain information about the entire search region S, gradually
concentrating their attention on portions of S which are in some sense
"likely" to contain the global minimum. S5 and S6 are most easily
described for problems where S is determined by lower and upper
bounds on each variable:

    $$S = \{x \in E^n \mid \ell_i \le x_i \le L_i,\ i = 1,\ldots,n\}$$

For ease of presentation we will restrict our attention to such
problems.

Strategy S5 (Piecewise Coordinate Projection - Zakharov [5])

a. Set up an initially empty list of points, and let
   $\hat{S} = \{x \in E^n \mid \hat{\ell}_i \le x_i \le \hat{L}_i,\ i = 1,\ldots,n\}$
   be the "remaining feasible region." Let $\hat{S} = S$ initially.
b. Randomly choose N points $x^k \in \hat{S}$, compute $f(x^k)$
   for each, and adjoin them to the list.
c. For each coordinate $x_i$ of x (i = 1,...,n) separate
   the remaining feasible interval $[\hat{\ell}_i, \hat{L}_i]$ into m
   equal subintervals. Let

       $$X_{ij} = \{x^k \text{ in the list whose } i\text{th component is in the } j\text{th subinterval of } [\hat{\ell}_i, \hat{L}_i]\}$$
       $$\phantom{X_{ij}} = \{x^k \mid \hat{\ell}_i + (j-1)(\hat{L}_i - \hat{\ell}_i)/m \le x_i^k \le \hat{\ell}_i + j(\hat{L}_i - \hat{\ell}_i)/m\}$$

   for i = 1,...,n and j = 1,...,m. Then $X_{i1}, X_{i2}, \ldots, X_{im}$
   describe the projection of the list of points $x^k$ into
   the m subintervals of the ith coordinate axis.
d. By considering $\{f(x^k) \mid x^k \in X_{ij}\}$ (i = 1,...,n;
   j = 1,...,m) select the subinterval set $X_{st}$ which
   is considered most likely to contain the global minimum
   (for details see Zakharov [5]).
e. By redefining $\hat{\ell}_s$ and $\hat{L}_s$, delete the subinterval
   sets $X_{sj}$ (j = 1,...,m; j ≠ t) from the remaining
   feasible region. Delete all points in the list which
   are in a deleted subinterval. Go to step b.

As the remaining feasible region gradually shrinks, the
global minimum will be more and more closely bracketed. The problem
with this method is that the most promising subinterval must be
determined on the basis of the sample of points $x^k$ chosen so far. There
is always a chance that a subinterval chosen for deletion will, in
fact, contain the global minimum solution, and once it is deleted
it can never be recovered.

Strategy S6 attempts to solve this problem by retaining the
entire region S throughout and using a probabilistic allocation
device to concentrate attention on areas in S which are most
promising. This algorithm is new and is still under development.
Initial results show some promise, but considerable improvement is
still necessary.



Strategy S6 (Coordinatewise Allocation)

a. Define a marginal probability distribution function $P_i$
   on the feasible interval $[\ell_i, L_i]$ of each coordinate
   axis i = 1,...,n. In the absence of other
   information, a uniform distribution seems reasonable
   for the initial distribution.
b. Randomly choose N points $x^k \in S$ and compute
   $f(x^k)$ for each. The probability distribution
   functions $P_i$ govern these choices in that the ith
   component $x_i^k$ of $x^k$ is chosen as a random sample
   point from the distribution $P_i$. Thus, the $P_i$
   determine the allocation of trial points to various regions
   in S.
c. Based on the results of the trials to date, modify
   the $P_i$ to increase the allocation of future points
   to regions considered likely to contain the global
   minimum. Go to step b.



Strategy S6 can have many realizations depending on the method of
handling step c. In the version of S6 reported in this paper, step
c is performed as follows for each coordinate i = 1,...,n.

1. The feasible interval $[\ell_i, L_i]$ is split into m
   subintervals.

2. A "success" is defined as a value of $f(x^k)$ in the
   bottom 25% of all $f(x^k)$ values, and the ratios $r_{ij}$
   of the number of successes in subinterval j of coordinate
   i to the total number of points in subinterval
   j are computed for all i and j.

3. The modified probability for subinterval j of coordinate
   i is given by $p_{ij} = r_{ij} / \sum_{j=1}^{m} r_{ij}$, the
   normalized success ratio.

Several improvements on this allocation scheme are being considered
for future testing.
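
The three-part update can be illustrated with the sketch below. It rests on several assumptions that the text does not settle: the subintervals are taken to be of equal width, a subinterval holding no points gets a success ratio of zero, the 25% cutoff is computed by simple rank, and new coordinates are drawn uniformly within the chosen subinterval. The function names and sample data are hypothetical.

```python
import random

def s6_update(points, values, lower, upper, m):
    """Return p[i][j], the normalized success ratio for subinterval j of
    coordinate i, from the trial points and their f values."""
    # A "success" is an f value in the bottom 25% of all values so far.
    cutoff = sorted(values)[max(0, len(values) // 4 - 1)]
    p = []
    for i in range(len(lower)):
        width = (upper[i] - lower[i]) / m
        hits, totals = [0] * m, [0] * m
        for x, fx in zip(points, values):
            j = min(m - 1, int((x[i] - lower[i]) / width))
            totals[j] += 1
            if fx <= cutoff:
                hits[j] += 1
        # r_ij = successes / points in subinterval j (0 if empty: assumption).
        r = [h / t if t else 0.0 for h, t in zip(hits, totals)]
        total_r = sum(r)
        # Uniform fallback if there is nothing to normalize (assumption).
        p.append([rj / total_r for rj in r] if total_r > 0 else [1.0 / m] * m)
    return p

def sample_coordinate(p_i, lo, hi, rng):
    """Pick a subinterval with probability p_i[j], then a uniform point in it."""
    m = len(p_i)
    j = rng.choices(range(m), weights=p_i)[0]
    width = (hi - lo) / m
    return lo + (j + rng.random()) * width

# Hypothetical data: eight trial points on [0, 4], four subintervals.
pts = [[0.5], [1.5], [1.6], [2.5], [2.6], [3.5], [3.6], [3.7]]
vals = [5.0, 1.0, 6.0, 2.0, 7.0, 8.0, 9.0, 10.0]
p = s6_update(pts, vals, [0.0], [4.0], m=4)
x_new = sample_coordinate(p[0], 0.0, 4.0, random.Random(0))
```

Here the two successes fall in the second and third subintervals, so each receives probability 0.5 and the outer subintervals receive none. A subinterval whose probability drops to zero is never sampled again under this rule, which is one way the search can settle around a good local minimum that is not global.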

In early tests it became apparent that performance of the var- 
ious strategies fluctuated considerably, depending on the particular 
test problem under investigation. For example, relative to the other 
strategies, S2 performed spectacularly on some problems but miserably 
on others. On closer examination it was found that S2 did well on 
problems for which the global f value was significantly lower than 
the local minima and for which the global region of attraction was 
quite large; that is, on problems which were rather easy to solve. 

This suggests the need for a benchmark strategy to be used for assess- 
ing problem difficulty. The benchmark strategy should have as little 
structure as possible. We have chosen to use the pure random search 
method for this purpose. 
Strategy S0 (Pure Random - Brooks [2])

a. Set k = 1.
b. Randomly select $x^k \in S$. Evaluate $f(x^k)$.
c. Replace k with k + 1. Go to step b. At each stage
   retain the best f value found to date.



This strategy may be regarded as a benchmark method since it makes no 
attempt to take advantage of the information gathered at previous stages. 
In this sense it is probably the most primitive strategy possible. 
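
A minimal sketch of the pure random benchmark, assuming a box-shaped search region and a fixed evaluation budget; the function name and the quadratic test objective are hypothetical.

```python
import random

def strategy_s0(f, lower, upper, n_evals, rng):
    """Pure random search: sample uniformly in S and keep the best point."""
    best_x, best_f = None, float("inf")
    for _ in range(n_evals):
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# Hypothetical usage on a two-variable quadratic bowl.
best_x, best_f = strategy_s0(lambda x: x[0] ** 2 + x[1] ** 2,
                             [-1.0, -1.0], [1.0, 1.0],
                             n_evals=1000, rng=random.Random(0))
```

No sample depends on any earlier one, so the method is a convenient yardstick for how hard a given problem is.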

We can use S0 in two ways:

1. If a strategy does not do considerably better than S0,
   it should be discarded.

2. If a test problem is such that S0 can solve it nearly
   as well as the other strategies, then the problem is
   not very difficult and probably is not useful for
   discriminating among strategies.



IV. Computational Experiments 

A number of computational experiments were performed to compare 
the various strategies presented above. For each of the test functions 
employed, each strategy was run 30 times with different random number 
sequences. A run was allowed to continue until the algorithm had re- 
quired 1000 evaluations of the objective function f (x) . 
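
One way to enforce such an evaluation budget is to wrap the objective in a counter, as in the sketch below. This is an assumed harness, not the one used for the experiments; the class and function names, the checkpoint values, and the sampling loop are hypothetical.

```python
import random

class BudgetExhausted(Exception):
    """Raised when a run has used its allowance of function evaluations."""

class CountingObjective:
    """Wrap f so that every call is counted and the best value is tracked."""

    def __init__(self, f, max_evals=1000, checkpoints=(200, 500, 1000)):
        self.f = f
        self.max_evals = max_evals
        self.checkpoints = set(checkpoints)
        self.n_evals = 0
        self.best = float("inf")
        self.history = {}  # evaluations used -> best f value so far

    def __call__(self, x):
        if self.n_evals >= self.max_evals:
            raise BudgetExhausted
        fx = self.f(x)
        self.n_evals += 1
        self.best = min(self.best, fx)
        if self.n_evals in self.checkpoints:
            self.history[self.n_evals] = self.best
        return fx

def run_trials(strategy, f, n_trials=30, seed=0):
    """Average the best f value at each checkpoint over independent trials.
    Assumes every trial runs until the budget is exhausted."""
    sums = {}
    for t in range(n_trials):
        counted = CountingObjective(f)
        try:
            strategy(counted, random.Random(seed + t))
        except BudgetExhausted:
            pass
        for k, v in counted.history.items():
            sums[k] = sums.get(k, 0.0) + v
    return {k: s / n_trials for k, s in sorted(sums.items())}

# Hypothetical usage: pure random sampling of a 1-D quadratic on [-1, 1].
def sample_forever(f, rng):
    while True:
        f([rng.uniform(-1.0, 1.0)])

averages = run_trials(sample_forever, lambda x: x[0] ** 2)
```

Because the counter sits inside the objective, the same wrapper charges the evaluations made by a local minimizer and those made by random sampling against one budget.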
Test problems with predictable local and global solutions were
constructed using the objective function

    $$f(x) = -\sum_{j=1}^{m} c_j \exp[(x - p_j)' A_j (x - p_j)]$$

This function consists of the superposition of m modes, where mode
j has depth $c_j$, position $p_j \in E^n$, and shape and width
determined by the n x n negative definite matrix $A_j$. Particular test
functions were obtained by choosing the parameters $c_j$ and $p_j$ from
a random number table. $A_j$ was chosen to ensure that the m modes
were narrow enough that they did not completely merge into one another.
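
A test function of this kind can be sketched as follows, reading the objective as a sum of m exponential modes, each weighted by its depth $c_j$ (an interpretation of the description above). The function name and the two-mode parameter values are hypothetical; the report drew its parameters from a random number table.

```python
import math

def make_test_function(c, p, A):
    """Build f(x) = -sum_j c[j] * exp((x - p[j])' A[j] (x - p[j])),
    where each A[j] is an n x n negative definite matrix (nested lists)."""
    def f(x):
        total = 0.0
        for cj, pj, Aj in zip(c, p, A):
            d = [xi - pi for xi, pi in zip(x, pj)]
            # Quadratic form d' A_j d, which is <= 0 for negative definite A_j.
            quad = sum(d[r] * Aj[r][s] * d[s]
                       for r in range(len(d)) for s in range(len(d)))
            total += cj * math.exp(quad)
        return -total
    return f

# Hypothetical two-mode example in two variables.
narrow = [[-2.0, 0.0], [0.0, -2.0]]
f = make_test_function(c=[9.0, 5.0],
                       p=[[0.0, 0.0], [3.0, 3.0]],
                       A=[narrow, narrow])
```

With modes this narrow and this far apart, the function value at each mode center is essentially the negative of that mode's depth, so the locations and values of the local and global minima are known in advance.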

Strategies S1 through S4 require an unconstrained minimizer.
Since the purpose of the study is to compare global strategies, a
minimizer is desired which uses the same information as is available to
the other strategies - function values but not derivatives. Powell's
derivative free method was selected [4].



V. Results 

The computational results obtained are summarized in Tables 1 
and 2. Table 1 gives characteristics of the test problems used. Table 
2 lists for each problem and for each strategy the best f value ob- 
tained after 200, 500, and 1000 function evaluations. Each value is 
the average of the 30 trials conducted for that problem and strategy. 
The percentage of the 30 trials which did not locate the global minimum
after 1000 function evaluations is also given in Table 2. It is
difficult to obtain a single measure of performance for this kind of
problem since we must balance speed of convergence against the chance
that the global solution will be missed entirely.

Problem   Number of   Number of   Value of Global
          Variables   Minima      Minimum
-------   ---------   ---------   ---------------
   A          2           4            -9.0
   B          2          10            -9.9
   C          2          10            -9.3
   D          2          10            -9.8
   E          2          10           -13.0
   F          5           5            -9.4
   G          5           5           -10.1
   H          5          10           -10.0
   I          5          10            -8.9
   J          5          20           -11.9

          Table 1. Characteristics of Test Problems

Function                          S0      S1      S2      S3      S4      S5      S6
--------  -----------------   ------  ------  ------  ------  ------  ------  ------
   A      best f after  200     -8.6    -8.5    -9.0    -8.2    -8.6    -8.5    -8.7
          best f after  500     -8.8    -8.9    -9.0    -8.9    -9.0    -8.7    -9.0
          best f after 1000     -8.9    -9.0    -9.0    -9.0    -9.0    -8.8    -9.0
          % failures               -     0.0     0.0     0.0     0.0    20.0     0.0

   B      best f after  200     -9.0    -8.9    -9.7    -9.0    -9.5    -9.1    -9.1
          best f after  500     -9.6    -9.3    -9.8    -9.9    -9.9    -9.7    -9.8
          best f after 1000     -9.7    -9.8    -9.9    -9.9    -9.9    -9.8    -9.9
          % failures               -     3.3     0.0     0.0     0.0    10.0     0.0

   C      best f after  200     -7.6    -8.3    -8.1    -8.8    -7.8    -7.8    -7.7
          best f after  500     -8.0    -8.6    -8.2    -9.1    -8.5    -8.1    -8.0
          best f after 1000     -8.3    -8.9    -8.6    -9.2    -8.7    -8.2    -8.2
          % failures               -    33.3    53.3     3.3    43.3    83.3    80.0

   D      best f after  200     -8.6    -8.9    -9.2    -7.8    -7.4    -8.8    -8.8
          best f after  500     -9.1    -9.5    -9.5    -9.4    -8.5    -9.2    -9.4
          best f after 1000     -9.4    -9.7    -9.6    -9.7    -9.6    -9.2    -9.6
          % failures               -    10.0    30.0     6.7    33.3    73.3    33.3

   E      best f after  200    -10.2   -10.1   -11.8    -8.3    -9.5   -10.9   -10.2
          best f after  500    -11.6   -12.1   -12.8   -10.5   -11.2   -12.6   -12.3
          best f after 1000    -12.1   -12.7   -12.9   -12.0   -13.0   -12.7   -12.8
          % failures               -    10.0     3.3    30.0     0.0     6.7     3.3

   F      best f after  200     -0.3    -6.7    -5.0    -6.4    -5.8    -0.8    -0.8
          best f after  500     -1.0    -7.9    -5.0    -8.0    -8.7    -2.9    -3.1
          best f after 1000     -1.5    -8.7    -5.6    -8.5    -8.9    -7.0    -7.5
          % failures               -    60.0    86.7    43.3    33.3    80.0    76.7

   G      best f after  200     -4.1    -7.4    -7.3    -7.1    -7.5    -5.0    -4.7
          best f after  500     -5.5    -9.3    -8.8    -9.7    -9.7    -8.3    -8.2
          best f after 1000     -6.1   -10.0    -9.1    -9.9   -10.1    -9.5    -9.3
          % failures               -     3.3    56.7    10.0     0.0    16.7    40.0

   H      best f after  200     -3.4    -7.6    -7.0    -6.8    -7.4    -3.7    -3.6
          best f after  500     -4.6    -8.3    -7.3    -8.7    -9.2    -6.3    -7.2
          best f after 1000     -5.2    -8.9    -7.7    -9.2    -9.7    -8.2    -8.9
          % failures               -    73.3    93.3    56.7    20.0    60.0    50.0

   I      best f after  200     -3.9    -7.6    -6.3    -6.5    -6.7    -4.2    -4.2
          best f after  500     -4.7    -8.0    -7.4    -8.0    -7.8    -5.8    -5.3
          best f after 1000     -5.3    -8.8    -7.6    -8.4    -8.6    -6.9    -6.1
          % failures               -    10.0    66.7    33.3    36.7    80.0   100.0

   J      best f after  200     -3.3    -7.4    -6.3    -6.7    -6.5    -3.8    -3.6
          best f after  500     -4.1    -8.8    -6.6    -7.4    -8.1    -5.3    -4.6
          best f after 1000     -4.8    -9.7    -7.2    -8.8    -8.3    -7.4    -6.5
          % failures               -    43.3    83.3    66.7    76.7    73.3    90.0

          Table 2. Test Results

From these test results we can draw some general conclusions: 

1. Test functions A and B were not very challenging since
   S0 did nearly as well as most other strategies.

2. S2 seems to make rapid initial progress but frequently
   stops short of the global solution; it is not recommended.

3. In general, S1, S3, and S4 perform about the same
   and better than the other strategies.

4. S5 and S6 exhibit slow initial convergence. Both 
frequently tend to concentrate the search effort around 
a good local minimum which is not global. 

5. On difficult problems even the best of these methods 
will frequently fail to locate the global minimum. 

It is also interesting to examine the entire graph of the number 
of function evaluations versus the best function value obtained for 
each strategy. These curves are shown for test function H in Figure 
1. The results for function H are representative of those obtained 
for the other functions and serve to emphasize conclusions 2, 3, and 4
above.

In conclusion, it is appropriate to note that these six methods 
do not come near to exhausting the possible techniques for avoiding 
local solutions. Methods which are hybrids of these and entirely new
methods should be tested. In particular, we hope to develop an
algorithm which allocates unconstrained minimizations to various regions
similar to the way strategy S6 allocates the individual points $x^k$.
Such a method would combine the rapid local optimizing power of the
minimization method with a global analysis of the feasible region.




Figure 1. Best function value obtained versus number of function
evaluations for each strategy, test function H.